GeoTransolver: Fix attention and turn off feature broadcasting. #1415

Merged: ktangsali merged 7 commits into 2.0.0-rc from geotransolver_attn_fix on Feb 27, 2026
@coreyjadams
Collaborator

PhysicsNeMo Pull Request

Description

Checklist

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

@coreyjadams changed the title from "Fix attention and turn off feature broadcasting." to "GeoTransolver: Fix attention and turn off feature broadcasting." on Feb 24, 2026
@coreyjadams changed the base branch from main to 2.0.0-rc on February 24, 2026 18:51
@coreyjadams marked this pull request as ready for review on February 24, 2026 18:52
@coreyjadams
Collaborator Author

Rebased into rc for 2.0.0

@greptile-apps
Contributor

greptile-apps bot commented Feb 24, 2026

Greptile Summary

Fixed critical attention residual connection bug in GeoTransolver and disabled feature broadcasting to match expected tensor shapes.

Key Changes:

  • Fixed residual connection in GALE_block.forward() to add attention output to original input fx instead of normalized input normed_inputs (line 460 in gale.py)
  • Disabled broadcast_global_features in config to avoid broadcasting scalar global features across all spatial points
  • Updated inference script to create scalar tensors with shape () instead of (1,) to match the non-broadcasting mode
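
To make the shape change in the last bullet concrete, here is a minimal PyTorch sketch of the difference between a 0-dimensional tensor (shape ()) and a one-element vector (shape (1,)). The name stream_velocity and its value are hypothetical and are not taken from inference_on_vtk.py. The last few lines only illustrate what broadcasting a scalar across points generally looks like in PyTorch; they are not a claim about how broadcast_global_features is implemented.

```python
import torch

# Hypothetical global feature; the name and value are illustrative only.
stream_velocity = 30.0

as_scalar = torch.tensor(stream_velocity)    # 0-dim tensor, shape ()
as_vector = torch.tensor([stream_velocity])  # 1-dim tensor, shape (1,)

print(tuple(as_scalar.shape), as_scalar.ndim)  # () 0
print(tuple(as_vector.shape), as_vector.ndim)  # (1,) 1

# Generic illustration of broadcasting one scalar over N points.
num_points = 4
per_point = as_scalar * torch.ones(num_points, 1)  # shape (4, 1)
print(tuple(per_point.shape))  # (4, 1)
```

The two forms hold the same value but differ in rank, so code that expects one will generally not accept the other without an explicit reshape.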

Architecture Fix:
The attention fix is critical: the previous code, attn[i] + normed_inputs[i], added the attention output to the layer-normalized input, so the un-normalized input never reached the output, which breaks the pre-norm residual architecture. The corrected version, attn[i] + fx[i], properly implements output = Attention(LayerNorm(input)) + input.
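
The difference between the two residual forms can be sketched with a toy pre-norm block. This is not the GALE_block code: PreNormBlock, mixer, and use_buggy_residual are hypothetical names, and a plain nn.Linear stands in for attention. Zeroing the stand-in makes the effect easy to see: with the corrected residual the block reduces to the identity, while the old form returns the layer-normalized input and drops the original signal.

```python
import torch
from torch import nn


class PreNormBlock(nn.Module):
    """Toy pre-norm block; `mixer` is a hypothetical stand-in for attention."""

    def __init__(self, dim: int, use_buggy_residual: bool = False):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mixer = nn.Linear(dim, dim)
        self.use_buggy_residual = use_buggy_residual

    def forward(self, fx: torch.Tensor) -> torch.Tensor:
        normed_inputs = self.norm(fx)
        attn = self.mixer(normed_inputs)
        if self.use_buggy_residual:
            # Old behavior: residual taken from the normalized input.
            return attn + normed_inputs
        # Fixed behavior: residual taken from the original input.
        return attn + fx


torch.manual_seed(0)
# (batch, tokens, channels), deliberately far from zero mean and unit variance.
fx = 5.0 + 3.0 * torch.randn(2, 4, 8)

fixed = PreNormBlock(8)
buggy = PreNormBlock(8, use_buggy_residual=True)

# Zero the stand-in sublayer so only the residual path contributes.
for block in (fixed, buggy):
    nn.init.zeros_(block.mixer.weight)
    nn.init.zeros_(block.mixer.bias)

with torch.no_grad():
    out_fixed = fixed(fx)
    out_buggy = buggy(fx)

print(torch.allclose(out_fixed, fx))  # True: the input passes through unchanged
print(torch.allclose(out_buggy, fx))  # False: only LayerNorm(fx) survives
```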

Important Files Changed

| Filename | Overview |
| --- | --- |
| physicsnemo/experimental/models/geotransolver/gale.py | Fixed residual connection bug in GALE_block - now correctly adds attention output to original input fx instead of normalized input normed_inputs |
| examples/cfd/external_aerodynamics/transformer_models/src/inference_on_vtk.py | Changed scalar parameters from shape (1,) to shape () to match disabled broadcast_global_features mode |
| examples/cfd/external_aerodynamics/transformer_models/src/conf/geotransolver_surface.yaml | Disabled broadcast_global_features flag to avoid broadcasting scalars across all points |

Last reviewed commit: 465d9c2

@greptile-apps greptile-apps bot left a comment

3 files reviewed, no comments

@coreyjadams
Collaborator Author

/blossom-ci

@coreyjadams
Collaborator Author

/blossom-ci

@ktangsali ktangsali merged commit 6194f04 into 2.0.0-rc Feb 27, 2026
1 check passed
@ktangsali ktangsali deleted the geotransolver_attn_fix branch February 27, 2026 18:12